1 Introduction

Use CTRL/CMD + Shift + K to knit and preview your document. Click the Visual button, or use CTRL/CMD + Shift + F4, to switch to visual mode, which lets you edit the formatted version in real time.

At the top of the .Rmd file (not shown in the rendered result) is the YAML header. YAML (originally "Yet Another Markup Language", now officially "YAML Ain't Markup Language") is a human-readable data serialization language. The header sets some options for your markdown and gives you a nicely formatted preamble. It is currently set to some of my preferred settings, but feel free to play around with this and make it your own.

We have it set here to output HTML, but you can just as easily produce PDF or Word documents. There are a bunch of built-in themes that you can explore here. I quite like the readable theme.
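The header options can also be passed from R when rendering, which can make it easier to see what each one does. The sketch below assumes the rmarkdown package is installed; the file name `my_report.Rmd` is hypothetical, and the option values are guesses based on the settings described in this document (readable theme, numbered sections, three table-of-contents levels), not a copy of the actual header.

```r
# Hypothetical file name; replace with the path to your own .Rmd file.
input_file <- "my_report.Rmd"

# Build an HTML output format. These arguments mirror common YAML header
# fields; the specific values here are assumptions for illustration.
fmt <- rmarkdown::html_document(
  theme = "readable",      # one of the built-in themes
  toc = TRUE,              # include a table of contents
  toc_depth = 3,           # track the first three header levels
  number_sections = TRUE   # number the headers automatically
)

# Render only if the file exists, so this sketch is safe to run as-is.
if (file.exists(input_file)) {
  rmarkdown::render(input_file, output_format = fmt)
}
```

Swapping `rmarkdown::html_document()` for `rmarkdown::pdf_document()` or `rmarkdown::word_document()` should produce the other output types, although PDF output also needs a LaTeX installation.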

Below the YAML header in the .Rmd document you will find the first code chunk. You will notice that this one does not appear in the rendered document - this is because it is the setup chunk, and it has the option include=FALSE set. We use this to set options for chunk behavior, as well as loading packages and data and such.
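For reference, the body of a setup chunk might look something like the following. This is a minimal sketch, not the actual setup chunk of this document: the chosen defaults are assumptions, and it only requires the knitr package.

```r
# Sketch of a setup chunk body. In the .Rmd file this would sit inside a
# chunk whose header sets include=FALSE so that it stays hidden.
# The defaults below are illustrative assumptions.
knitr::opts_chunk$set(
  echo = TRUE,      # show code in the rendered document by default
  message = FALSE,  # hide messages, such as package startup notes
  warning = FALSE   # hide warnings in the rendered document
)

# Package and data loading would typically follow here, for example:
# library(dplyr)
# library(ggplot2)
```

Any option set this way becomes the default for every later chunk, and an individual chunk can still override it in its own header.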

2 Syntax

The # above makes a header. A single # is the largest header, and extra #s are smaller headers.

2.1 Smaller Header

This header is automatically numbered because of the YAML settings and the double #.

2.1.1 Even Smaller Header

This is three #s.

2.1.1.1 Super Tiny Header

This is four #s. Note that it does not show up in the table of contents because we only asked it to keep track of the first three levels.

2.2 More Syntax

Use a single asterisk to make font italic.

Use double asterisks to make font bold.

Note that you need a blank line between paragraphs to split up text. Starting on a new line is not enough.

To make bullet points, use -:

  • A thing
  • Another thing
    • Sub thing

To make numbered lists, use n.:

  1. First thing
  2. Second thing
    • Sub thing

To put code in-line, use back ticks (``)

For multiple lines of verbatim code, use triple back ticks.
x + 1 = y

To make block quotes, use > at the start of the line. Use these to emphasize a point.

3 Code Chunks

Here we will explore some proper code chunks. You can use CTRL/CMD + ALT + I to create a new chunk. After the r comes the chunk name. This is not required, but is convenient if we hit an error because it will tell us where the problem was.

If we want to run code, but not show it, we can use echo=FALSE in the chunk options.

Otherwise, our code chunk will be visible. Let’s show off our example regression from the FSCI paper.

lm <- lm(normvalue ~ year + FSCI_region, data = df)
summary(lm)
## 
## Call:
## lm(formula = normvalue ~ year + FSCI_region, data = df)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -31.370  -8.881  -3.519   6.425  72.022 
## 
## Coefficients:
##                                            Estimate Std. Error t value
## (Intercept)                               802.36120   99.04397   8.101
## year                                       -0.39300    0.04928  -7.975
## FSCI_regionEastern Asia                    11.45318    2.49413   4.592
## FSCI_regionLatin America & Caribbean        0.24044    1.75020   0.137
## FSCI_regionNorthern Africa & Western Asia  -3.01052    1.81558  -1.658
## FSCI_regionNorthern America and Europe     -7.85822    2.04028  -3.852
## FSCI_regionOceania                         -0.14376    2.09696  -0.069
## FSCI_regionSouth-eastern Asia               2.58542    1.95749   1.321
## FSCI_regionSouthern Asia                    4.86597    2.03578   2.390
## FSCI_regionSub-Saharan Africa              15.54695    1.69547   9.170
##                                                       Pr(>|t|)    
## (Intercept)                               0.000000000000000845 ***
## year                                      0.000000000000002294 ***
## FSCI_regionEastern Asia                   0.000004607788663308 ***
## FSCI_regionLatin America & Caribbean                   0.89074    
## FSCI_regionNorthern Africa & Western Asia              0.09741 .  
## FSCI_regionNorthern America and Europe                 0.00012 ***
## FSCI_regionOceania                                     0.94535    
## FSCI_regionSouth-eastern Asia                          0.18669    
## FSCI_regionSouthern Asia                               0.01691 *  
## FSCI_regionSub-Saharan Africa             < 0.0000000000000002 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 14.93 on 2484 degrees of freedom
## Multiple R-squared:  0.2395, Adjusted R-squared:  0.2368 
## F-statistic: 86.93 on 9 and 2484 DF,  p-value: < 0.00000000000000022

This shows our code and output, but it is not very nice to look at.
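The chunk name and options discussed in this section normally go in the chunk header. Recent versions of knitr (1.35 and later, as far as I know) also accept them as special `#|` comments at the top of the chunk body, which is easier to show in a plain code listing. The label `hidden-summary` and the toy data below are made up for illustration.

```r
#| label: hidden-summary
#| echo: false

# With echo set to false, this code still runs when the document is
# knitted, but only its output appears in the rendered result.
scores <- c(2, 4, 6)
mean_score <- mean(scores)
mean_score
```

Outside of knitr the `#|` lines are ordinary comments, so the same code also runs unchanged in the console.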

4 Regression Outputs

4.1 Kable

To get a cleaner regression output, we can convert our regression output to a data frame, then use knitr::kable() to create a nice looking table.

lm_df <- broom::tidy(lm)
knitr::kable(lm_df)

| term | estimate | std.error | statistic | p.value |
|:---|---:|---:|---:|---:|
| (Intercept) | 802.3611950 | 99.0439650 | 8.1010609 | 0.0000000 |
| year | -0.3930000 | 0.0492762 | -7.9754489 | 0.0000000 |
| FSCI_regionEastern Asia | 11.4531769 | 2.4941272 | 4.5920580 | 0.0000046 |
| FSCI_regionLatin America & Caribbean | 0.2404370 | 1.7502015 | 0.1373768 | 0.8907441 |
| FSCI_regionNorthern Africa & Western Asia | -3.0105244 | 1.8155793 | -1.6581619 | 0.0974110 |
| FSCI_regionNorthern America and Europe | -7.8582226 | 2.0402820 | -3.8515374 | 0.0001203 |
| FSCI_regionOceania | -0.1437598 | 2.0969580 | -0.0685563 | 0.9453483 |
| FSCI_regionSouth-eastern Asia | 2.5854171 | 1.9574857 | 1.3207847 | 0.1866948 |
| FSCI_regionSouthern Asia | 4.8659655 | 2.0357824 | 2.3902188 | 0.0169124 |
| FSCI_regionSub-Saharan Africa | 15.5469469 | 1.6954749 | 9.1696708 | 0.0000000 |
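To see roughly what the conversion to a data frame involves, here is a base R sketch that builds a similar table by hand. It is not how broom::tidy() is implemented, and it uses the built-in mtcars data rather than the FSCI data, which is not included here.

```r
# Fit a small example model on built-in data (not the FSCI data).
fit <- lm(mpg ~ wt + hp, data = mtcars)

# coef(summary()) returns a numeric matrix with one row per model term.
coef_mat <- coef(summary(fit))

# Move the row names into a column and rename the statistics so the
# result is a plain data frame with tidy-style column names.
coef_df <- data.frame(
  term = rownames(coef_mat),
  estimate = coef_mat[, "Estimate"],
  std.error = coef_mat[, "Std. Error"],
  statistic = coef_mat[, "t value"],
  p.value = coef_mat[, "Pr(>|t|)"],
  row.names = NULL
)

print(coef_df)
```

A data frame like `coef_df` can be passed to knitr::kable() in the same way as `lm_df`.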

Extra steps to get the column names capitalized and the numbers rounded:

lm_df_cleaner <- lm_df %>% 
  dplyr::mutate(across(where(is.numeric), ~ round(.x, 3))) %>% 
  setNames(c(snakecase::to_title_case(names(.))))
knitr::kable(lm_df_cleaner)

| Term | Estimate | Std Error | Statistic | P Value |
|:---|---:|---:|---:|---:|
| (Intercept) | 802.361 | 99.044 | 8.101 | 0.000 |
| year | -0.393 | 0.049 | -7.975 | 0.000 |
| FSCI_regionEastern Asia | 11.453 | 2.494 | 4.592 | 0.000 |
| FSCI_regionLatin America & Caribbean | 0.240 | 1.750 | 0.137 | 0.891 |
| FSCI_regionNorthern Africa & Western Asia | -3.011 | 1.816 | -1.658 | 0.097 |
| FSCI_regionNorthern America and Europe | -7.858 | 2.040 | -3.852 | 0.000 |
| FSCI_regionOceania | -0.144 | 2.097 | -0.069 | 0.945 |
| FSCI_regionSouth-eastern Asia | 2.585 | 1.957 | 1.321 | 0.187 |
| FSCI_regionSouthern Asia | 4.866 | 2.036 | 2.390 | 0.017 |
| FSCI_regionSub-Saharan Africa | 15.547 | 1.695 | 9.170 | 0.000 |
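One side effect of rounding is that very small p-values display as 0.000. A possible refinement, sketched below in base R, is to label anything under a threshold as "<0.001" instead. The threshold and the helper name `p_label` are my own choices; the example values are taken from the unrounded table above.

```r
# A few p-values copied from the unrounded regression table.
p_values <- c(0.0000046, 0.0001203, 0.0169124, 0.8907441)

# Label very small values as "<0.001"; otherwise round to three decimals.
# The 0.001 threshold is an assumption, chosen to match the rounding above.
p_label <- ifelse(p_values < 0.001, "<0.001", sprintf("%.3f", p_values))

print(p_label)
```

Because the result is a character vector, this step belongs at the very end, after any numeric filtering or sorting is done.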

4.2 sjPlot

For a very clean regression table with less work, try the sjPlot package:

sjPlot::tab_model(
  lm, 
  p.style = 'stars', 
  digits = 2,
  show.se = TRUE,
  robust = TRUE,
  show.reflvl = TRUE
)

normvalue

| Predictors | Estimates | std. Error | CI |
|:---|---:|---:|---:|
| (Intercept) | 802.36 *** | 104.19 | 598.06 – 1006.66 |
| Central Asia | Reference | | |
| Eastern Asia | 11.45 *** | 3.27 | 5.05 – 17.86 |
| Latin America & Caribbean | 0.24 | 1.65 | -3.00 – 3.48 |
| Northern Africa & Western Asia | -3.01 | 1.66 | -6.26 – 0.24 |
| Northern America and Europe | -7.86 *** | 1.70 | -11.18 – -4.53 |
| Oceania | -0.14 | 1.93 | -3.92 – 3.64 |
| South-eastern Asia | 2.59 | 1.73 | -0.80 – 5.98 |
| Southern Asia | 4.87 ** | 1.77 | 1.40 – 8.33 |
| Sub-Saharan Africa | 15.55 *** | 1.66 | 12.28 – 18.81 |
| year | -0.39 *** | 0.05 | -0.49 – -0.29 |
| Observations | 2494 | | |
| R2 / R2 adjusted | 0.240 / 0.237 | | |

* p<0.05   ** p<0.01   *** p<0.001

Note that this function takes the lm model object itself, not a data frame. It is designed to work with regression models and provides a ton of options for displaying them. Check out the documentation here. This is where you really learn how to use a package. It is written by the author, with abundant vignettes and examples.

5 Interactive Tables

We’ve already seen how to make tables above. For static tables, knitr::kable() is a good choice. The kableExtra package is also a great extension to knitr, giving you tons of options for customization. See the docs for examples.

For interactive tables, there are a couple of different options.

5.1 DT Table

The DT package is a classic choice for interactive tables. Note that we are setting echo=FALSE here, so the code chunk will not show up.

This takes almost no code, and creates quite a decent looking table with lots of options.
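Since the chunk above is hidden, here is a guess at what a minimal DT call could look like. This is not the hidden code itself: the choice of the gapminder data and of every option below is an assumption, and it needs the DT and gapminder packages installed.

```r
# Build an interactive table from the gapminder data.
# The options shown are illustrative assumptions, not the hidden chunk.
dt_table <- DT::datatable(
  gapminder::gapminder,
  filter = "top",                   # per-column filter boxes above the table
  rownames = FALSE,                 # drop the row-number column
  options = list(pageLength = 10)   # rows shown per page
)

# In an R Markdown chunk, printing the object renders the widget.
dt_table
```

Calling `DT::datatable()` with only the data argument already gives a searchable, sortable, paginated table, which is why so little code is needed.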

5.2 Reactable

My personal favorite for interactive tables is reactable. The documentation is excellent, so check it out if you’re interested.

reactable::reactable(
  data = gapminder,
  filterable = TRUE,
  searchable = TRUE,
  outlined = TRUE,
  bordered = TRUE,
  compact = TRUE,
  striped = TRUE,
  showPageSizeOptions = TRUE
)

I find the options for customization here much more intuitive than DT, and the documentation is much easier to use.

6 Plots

6.1 Static Plots

We haven’t covered plots yet, but you really just put your plotting code in a chunk and the plot will appear in the rendered document.

gapminder %>% 
  filter(year == 2007) %>% 
  ggplot(aes(x = gdpPercap, y = lifeExp, color = continent, size = pop)) +
  geom_point() +
  theme_classic() +
  labs(
    x = 'GDP per Capita',
    y = 'Life Expectancy',
    title = 'Life Expectancy against GDP per Capita'
  )

We can change the alignment, size, and resolution of our plot in the chunk options:

gapminder %>% 
  filter(year == 2007) %>% 
  ggplot(aes(x = gdpPercap, y = lifeExp, color = continent, size = pop)) +
  geom_point() +
  theme_classic() +
  labs(
    x = 'GDP per Capita',
    y = 'Life Expectancy',
    title = 'Life Expectancy against GDP per Capita'
  )

Caption Goes Here
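The chunk options behind a figure like this are not visible in the rendered output, so here is a sketch of the kind of settings involved. The values are assumptions rather than the ones actually used for the plot above, and the chunk name `scatter` is hypothetical. Setting them through knitr::opts_chunk$set() applies them to every later chunk; the comment shows the per-chunk form.

```r
# Per-chunk form, written in the chunk header of the .Rmd file:
#   {r scatter, fig.align='center', fig.width=6, fig.height=4, dpi=300,
#    fig.cap='Caption Goes Here'}

# Document-wide form, usually placed in the setup chunk.
# All values here are illustrative assumptions.
knitr::opts_chunk$set(
  fig.align = "center",  # horizontal alignment of the figure
  fig.width = 6,         # width in inches
  fig.height = 4,        # height in inches
  dpi = 300              # resolution in dots per inch
)
```

The caption is usually set per chunk with `fig.cap`, since each figure needs its own text.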

6.2 Interactive Plots

What about an interactive plot? We can use the very popular plotly package to do this. Plotly is built on a JavaScript library (plotly.js) and is widely used from Python, but the plotly R package gives us an easy way to access it. It has its own syntax, but you can also use the ggplotly() function to convert a ggplot object to a plotly object.

plot <- gapminder %>% 
  filter(year == 2007) %>% 
  ggplot(aes(
    x = gdpPercap, 
    y = lifeExp, 
    color = continent,
    size = pop,
    text = paste0(
      'Country: ', country, '\n',
      'Continent: ', continent, '\n',
      'Life Exp: ', lifeExp, '\n',
      'Population: ', pop
    )
  )) +
  geom_point() +
  theme_classic() +
  labs(
    x = 'GDP per Capita',
    y = 'Life Expectancy',
    title = 'Life Expectancy against GDP per Capita'
  )

ggplotly(plot, tooltip = 'text')

Note that you can hover over points to see more information from our text field, and also move, zoom, select, and download a static image of the plot.
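For comparison, a similar plot can be built with plotly's own syntax instead of converting a ggplot object. The sketch below is an approximation of the chart above rather than an exact match, and it assumes the plotly and gapminder packages are installed.

```r
# Keep only the 2007 rows, using base R to avoid extra dependencies.
gap_2007 <- subset(gapminder::gapminder, year == 2007)

# Columns are referenced with a leading ~ in plotly's formula syntax.
fig <- plotly::plot_ly(
  data = gap_2007,
  x = ~gdpPercap,
  y = ~lifeExp,
  color = ~continent,
  size = ~pop,
  text = ~paste0("Country: ", country, "\n", "Population: ", pop),
  hoverinfo = "text",   # show only the custom text on hover
  type = "scatter",
  mode = "markers"
)

# Titles and axis labels are set through layout().
fig <- plotly::layout(
  fig,
  title = "Life Expectancy against GDP per Capita",
  xaxis = list(title = "GDP per Capita"),
  yaxis = list(title = "Life Expectancy")
)

fig
```

The ggplotly() route is convenient when a ggplot already exists, while the native syntax gives more direct control over hover text and layout.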